This paper describes ongoing research into the application of machine learning techniques 
for improving access to governmental information in complex digital libraries. Under the 
auspices of the GovStat Project, our goal is to identify a small number of semantically valid 
concepts that adequately span the intellectual domain of a collection. The goal of this 
discovery is twofold. First, we desire a practical aid for information architects. Second, 

The focus of access control in client/server environments is on protecting sensitive server 
resources by determining whether or not a client is authorized to access those resources. 
The set of resources is usually static, and an access control policy associated with each 
resource specifies who is authorized to access the resource. In this article, we turn the 
traditional client/server access control model on its head and address how to protect the 
sensitive content that clients disclose to and r ... 

As the World Wide Web evolves into an immense information network, it is tempting to 
build new digital library services and expand existing digital library services to make use of 
web content. In this paper, we present the design and implementation of G-Portal, a web 
portal that aims to provide digital library services over geospatial and georeferenced 
content found on the World Wide Web. G-Portal adopts a map-based user interface to 
visualize and manipulate the distributed geospatial and georef ... 

This paper describes a system for question answering using semi-structured metadata, 
QuASM (pronounced "chasm"). Question answering systems aim to improve search 
performance by providing users with specific answers, rather than having users scan 
retrieved documents for these answers. Our goal is to answer factual questions by 
exploiting the structure inherent in documents found on the World Wide Web (WWW). 
Based on this structure, documents are indexed into smaller units and associated with 

At the present time, several shortcomings prevent the more effective use and more intense 
application of web information systems. Recent developments that are subsumed by the 
term Semantic Web aim to solve these problems. The inherent idea behind these 
approaches is the annotation of data with metadata, in order to enhance automated 
processing and the use of ontologies to describe data semantically. However, the 
emergence of the Semantic Web raises new issues (e.g. significantly higher complexit ... 

We present an architecture optimization technique called divide-and-concatenate for 
universal hash functions. The area of a multiplier increases quadratically and its speed 
increases gradually with the operand size and two universal hash functions are equivalent if 
they have the same collision probability property. Based on these observations, the divide- 
and-concatenate approach divides a 2w-bit data path (with collision probability 2^-2w) into 
two w-bit data paths (each with collision probabilit ... 

Semantic context detection is one of the key techniques to facilitate efficient multimedia 
retrieval. Semantic context is a scene that completely represents a meaningful information 
segment to human beings. In this paper, we propose a novel hierarchical approach that 
models the statistical characteristics of several audio events, over a time series, to 
accomplish semantic context detection. The approach consists of two stages: audio event 
and semantic context detections. HMMs are used to model b ... 

Keywords: Gaussian mixture model, audio content analysis, audio retrieval, hidden Markov 

This paper discusses applying facilities in SMIL 2.0 to the problem of annotating multimedia 
presentations. Rather than viewing annotations as collections of (abstract) meta-information 
for use in indexing, retrieval or semantic processing, we view annotations as a set of 
peer-level content with temporal and spatial relationships that are important in presenting a 
coherent story to a user. The composite nature of the collection of media is essential to the 
nature of peer-level annotations: you ... 

Daniel DeMenthon, David Doermann 

November 2003 Proceedings of the eleventh ACM international conference on 
Multimedia 

Full text available: pdf(994.11 KB) Additional Information: full citation, abstract, references, index terms 

This paper describes a novel methodology for implementing video search functions such as 
retrieval of near-duplicate videos and recognition of actions in surveillance video. Videos are 
divided into half-second clips whose stacked frames produce 3D space-time volumes of 
pixels. Pixel regions with consistent color and motion properties are extracted from these 
3D volumes by a threshold-free hierarchical space-time segmentation technique. Each 
region is then described by a high-dimensional point wh ... 

Multimedia 

Full text available: pdf(339.39 KB) Additional Information: full citation, abstract, references, index terms 

Real-time content-based access to live video data requires content analysis applications 
that are able to process the video data at least as fast as the video data is made available 
to the application and with an acceptable error rate. Statements such as this express quality of 
service (QoS) requirements to the application. In order to provide some level of control of 
the QoS provided, the video content analysis application must be scalable and resource 
aware so that requirements of timeliness and ac ... 

Keywords: QoS and resource management, event-based communication, parallel 

Full text available: pdf(429.65 KB) Additional Information: full citation, abstract, references, index terms 

Web personalization is the process of customizing a Web site to the needs of each specific 
user or set of users, taking advantage of the knowledge acquired through the analysis of 
the user's navigational behavior. Integrating usage data with content, structure or user 
profile data enhances the results of the personalization process. In this paper, we present 
SEWeP, a system that makes use of both the usage logs and the semantics of a Web site's 
content in order to personalize it. Web content is ... 

Keywords: Web mining, Web personalization, concept hierarchies, semantic annotation of 

October 2003 Proceedings of the 2003 ACM workshop on Digital rights management 
Full text available: pdf(285.80 KB) 

review 

Unauthorized copying of movies is a major concern for the motion picture industry. While 
unauthorized copies of movies have been distributed via portable physical media for some 
time, low-cost, high-bandwidth Internet connections and peer-to-peer file sharing networks 
provide highly efficient distribution media. Many movies are showing up on file sharing 
networks shortly after, and in some cases prior to, theatrical release. It has been argued 
that the availability of unauthorized copies directi ... 

Keywords: digital rights management, file sharing, insider attacks, multimedia, physical 
security, policy 

I. Ganapathy, R. F. Hobson 

September 1976 Proceedings of the eighth international conference on APL 

Full text available: pdf(763.50 KB) Additional Information: full citation, abstract, references, index terms 

GPMS is a keyword-based information entry and retrieval system. The data consist of a 
collection of messages, where a message is an arbitrary quantum of free text. Users supply 
both keywords and message body. Keywords are hashed into an association file while 
messages, which receive unique chronological names, reside in a message file. A simple 
English-like query package is included. GPMS has been used as a community information 
management system (advertisements, quotations, plea ... 



17 Dynamic Access Control: An access control model for dynamic client-side content 
Adam Hess, Kent E. Seamons 

June 2003 Proceedings of the eighth ACM symposium on Access control models and 
technologies 

Full text available: pdf(608.50 KB) Additional Information: full citation, abstract, references, index terms 

The focus of access control in client/server environments is on protecting sensitive server 
resources by determining whether or not a client is authorized to access those resources. 
The set of resources is usually static, and an access control policy associated with each 
resource specifies who is authorized to access the resource. In this paper, we turn the 
traditional client/server access control model on its head, and address how to protect the 
sensitive content that clients disclose to serve ... 

Keywords: access control, authentication, credentials, trust negotiation 

Suhit Gupta, Gail Kaiser, David Neistadt, Peter Grimm 

May 2003 Proceedings of the twelfth international conference on World Wide Web 

Full text available: pdf(296.17 KB) Additional Information: full citation, abstract, references, citings, index terms 

Web pages often contain clutter (such as pop-up ads, unnecessary images and extraneous 
links) around the body of an article that distracts a user from actual content. Extraction of 
"useful and relevant" content from web pages has many applications, including cell phone 
and PDA browsing, speech rendering for the visually impaired, and text summarization. 
Most approaches to removing clutter or making content more readable involve changing 
font size or removing HTML and data components such as imag ... 

Keywords: DOM trees, HTML documents, accessibility, content extraction, reformatting, 
speech rendering 

Alfred Kobsa, Jorg Schreck 

May 2003 ACM Transactions on Internet Technology (TOIT), volume 3 issue 2 

Full text available: pdf(881.69 KB) Additional Information: full citation, abstract, references, citings, index terms, review 

User-adaptive applications cater to the needs of each individual computer user, taking for 
example users' interests, level of expertise, preferences, perceptual and motoric abilities, 
and the usage environment into account. Central user modeling servers collect and process 
the information about users that different user-adaptive systems require to personalize 
their user interaction. Adaptive systems are generally better able to cater to users the more 
data their user modeling systems collect and ... 

Keywords: Chaum mix, KQML, User modeling, access control, anonymity, encryption, 
personal information, personalization, privacy, pseudonymity, reference model, secrecy, 
security, user-adaptive systems 

symposium on Computer science education, Volume 35 issue 1 
Full text available: pdf(150.63 KB) Additional Information: full citation, abstract, references, citings, index terms 

Writing skills need to be integrated into the Computer Science (CS) curriculum, and there is 
little empirical evidence on how best to do so. This paper first describes a technical writing 
class for CS undergraduates. Then it presents the results of a statistical study that 
investigated student perceptions of their learning experience in three areas: skill mastery, 
self-efficacy, and motivation. Positive results support this approach to teaching writing to 
CS students. Some unexpected findings in ... 

Keywords: pedagogy, writing 

A holistic approach to service survivability 

Angelos D. Keromytis, Janak Parekh, Philip N. Gross, Gail Kaiser, Vishal Misra, Jason Nieh, Dan 
Rubenstein, Sal Stolfo 

October 2003 Proceedings of the 2003 ACM workshop on Survivable and self- 
regenerative systems: in association with 10th ACM Conference on 
Computer and Communications Security 

Full text available: pdf(1.58 MB) Additional Information: full citation, abstract, references, index terms 

We present SABER (Survivability Architecture: Block, Evade, React), a proposed 
survivability architecture that blocks, evades and reacts to a variety of attacks by using 
several security and survivability mechanisms in an automated and coordinated fashion. 
Contrary to the ad hoc manner in which contemporary survivable systems are built (using 
isolated, independent security mechanisms such as firewalls, intrusion detection systems 
and software sandboxes), SABER integrates several different techno ... 



Keywords: intrusion detection, overlay networks, survivability 



2 Labeling images with a computer game 
Luis von Ahn, Laura Dabbish 

April 2004 Proceedings of the 2004 conference on Human factors in computing 
systems 

Full text available: pdf(493.67 KB) Additional Information: full citation, abstract, references, index terms 

We introduce a new interactive system: a game that is fun and can be used to create 
valuable output. When people play the game they help determine the contents of images by 
providing meaningful labels for them. If the game is played as much as popular online 
games, we estimate that most images on the Web can be labeled in a few months. Having 
proper labels associated with each image on the Web would allow for more accurate image 
search, improve the accessibility of sites (by providing descriptio ... 

Keywords: World Wide Web, distributed knowledge acquisition, image labeling, online 
games 




Calculating error rates for filtering software 

Paul J. Resnick, Derek L. Hansen, Caroline R. Richardson 

September 2004 Communications of the ACM, Volume 47 issue 9 

Full text available: pdf(134.44 KB) Additional Information: full citation, abstract, references, index terms 

Surveys in the U.S. have found that 95% of schools [4], 43% of public libraries [5], and 
33% of teenagers' parents [8] employ filtering software to block access to pornography and 
other inappropriate content. Many products are also now available to filter out spam 
email. Filtering software, however, cannot perfectly discriminate between allowed and 
forbidden content, resulting in two types of errors. First, under-blocking occurs when 
content is not blocked that should be restricted. Second, over- ... 

Network security: Web tap: detecting covert web traffic 
Kevin Borders, Atul Prakash 

October 2004 Proceedings of the 11th ACM conference on Computer and 
communications security 

Full text available: pdf(129.06 KB) Additional Information: full citation, abstract, references, index terms 

As network security is a growing concern, system administrators lock down their networks 
by closing inbound ports and only allowing outbound communication over selected protocols 
such as HTTP. Hackers, in turn, are forced to find ways to communicate with compromised 
workstations by tunneling through web requests. While several tools attempt to analyze 
inbound traffic for denial-of-service and other attacks on web servers, Web Tap's focus is on 
detecting attempts to send significant amounts of ... 

Keywords: HTTP, anomaly detection, covert channels, intrusion detection, spyware 
detection, tunnels 



Information protection methods: Display-only file server: a solution against information 
theft due to insider attack 
Yang Yu, Tzi-cker Chiueh 

October 2004 Proceedings of the 4th ACM workshop on Digital rights management 

Full text available: pdf(311.80 KB) Additional Information: full citation, abstract, references, index terms 

Insider attack is one of the most serious cybersecurity threats to corporate America. Among 
all insider threats, information theft is considered the most damaging in terms of potential 
financial loss. Moreover, it is also especially difficult to detect and prevent, because in many 
cases the attacker has the proper authority to access the stolen information. According to 
the 2003 CSI/FBI Computer Crime and Security Survey, theft of proprietary information 
was the single largest category of los ... 

Keywords: access, digital rights management, information theft, insider attack 



6 Providing effective ICT services using open source technologies: University of the 
Philippines experience 
Rommel P. Feria 

October 2004 Proceedings of the 32nd annual ACM SIGUCCS conference on User 
services 

Full text available: pdf(51.44 KB) Additional Information: full citation, abstract, references, index terms 

The University of the Philippines, comprised of seven constituent universities, is the 
premier state university of the country. Providing effective information and 
communications technology (ICT) to its faculty, staff and students is critical in ensuring the 
delivery of quality education. 

Established in 1966, the University Computer Center (UCC) is the unit mandated to provide 
ICT to the university. From mainframe-based computer services, the UCC has migrated its 
operations to worksta ... 

Keywords: University of the Philippines, information and communications technology, 
information technology, open source 



7 Industry/government track papers: TiVo: making show recommendations using a 
distributed collaborative filtering architecture 
Kamal Ali, Wijnand van Stam 

August 2004 Proceedings of the 2004 ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Full text available: pdf(810.92 KB) Additional Information: full citation, abstract, references, index terms 

We describe the TiVo television show collaborative recommendation system which has been 
fielded in over one million TiVo clients for four years. Over this install base, TiVo currently 
has approximately 100 million ratings by users over approximately 30,000 distinct TV 
shows and movies. TiVo uses an item-item (show to show) form of collaborative filtering 
which obviates the need to keep any persistent memory of each user's viewing preferences 
at the TiVo server. Taking advantage of TiVo's client- ... 

Keywords: clustering clickstreams, collaborative-filtering 



Image Retrieval from the World Wide Web: Issues, Techniques, and Systems 
M. L. Kherfi, D. Ziou, A. Bernardi 

March 2004 ACM Computing Surveys (CSUR), Volume 36 issue 1 

Full text available: pdf(294.13 KB) Additional Information: full citation, abstract, references, index terms 

With the explosive growth of the World Wide Web, the public is gaining access to massive 
amounts of information. However, locating needed and relevant information remains a 
difficult task, whether the information is textual or visual. Text search engines have existed 
for some years now and have achieved a certain degree of success. However, despite the 
large number of images available on the Web, image search engines are still rare. In this 
article, we show that in order to allow people to profi ... 

Keywords: Image-retrieval, World Wide Web, crawling, feature extraction and selection, 
indexing, relevance feedback, search, similarity 



9 Posters: Content-based filtering and personalization using structured metadata 
A. Mufit Ferman, James H. Errico, Peter van Beek, M. Ibrahim Sezan 

July 2002 Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries 

Full text available: pdf(111.36 KB) Additional Information: full citation, abstract, references, index terms 

Structured descriptions of multimedia content and automatically generated user profiles are 
used to filter content. 

Keywords: MPEG-7, XML, multimedia content, personalization, recommendation, user 
profile 



10 Virtual and augmented reality: Through the looking glass: the use of lenses as an 
interface tool for Augmented Reality interfaces 
Julian Looser, Mark Billinghurst, Andy Cockburn 

June 2004 Proceedings of the 2nd international conference on Computer graphics and 
interactive techniques in Australasia and South East Asia 

Full text available: pdf(430.34 KB) Additional Information: full citation, abstract, references, index terms 

In this paper we present new interaction techniques for virtual environments. Based on an 
extension of 2D MagicLenses, we have developed techniques involving 3D lenses, 
information filtering and semantic zooming. These techniques provide users with a natural, 
tangible interface for selectively zooming in and out of specific areas of interest in an 
Augmented Reality scene. They use rapid and fluid animation to help users assimilate the 
relationship between views of detailed focus and global conte ... 

Keywords: Augmented Reality, MagicLenses, semantic zooming 



11 Mobility: Session level techniques for improving web browsing performance on 
wireless links 

Pablo Rodriguez, Sarit Mukherjee, Sampath Rangarajan 

May 2004 Proceedings of the 13th international conference on World Wide Web 

Full text available: pdf(486.66 KB) Additional Information: full citation, abstract, references, index terms 

Recent observations through experiments that we have performed in current third 
generation wireless networks have revealed that the achieved throughput over wireless 
links varies widely depending on the application. In particular, the throughputs achieved by 
the file transfer application (FTP) and the web browsing application (HTTP) are quite different. The 
throughput achieved over an HTTP session is much lower than that achieved over an FTP 
session. The reason for the lower HTTP throughput is that the HTT ... 

Keywords: optimizations, web, wireless 



12 HEC Montreal: deployment of a large-scale mail installation 
Ludovic Marcotte 

May 2004 Linux Journal, Volume 2004 issue 121 

Full text available: html(16.83 KB) Additional Information: full citation, abstract 
If you thought you had mail problems, try 600,000 spams a day. 

Cross-lingual C*ST*RD: English access to Hindi information 

Anton Leuski, Chin-Yew Lin, Liang Zhou, Ulrich Germann, Franz Josef Och, Eduard Hovy 
September 2003 ACM Transactions on Asian Language Information Processing (TALIP), 
Volume 2 Issue 3 

Full text available: pdf(210.61 KB) Additional Information: full citation, abstract, references, index terms 

We present C*ST*RD, a cross-language information delivery system that supports cross- 
language information retrieval, information space visualization and navigation, machine 
translation, and text summarization of single documents and clusters of documents. 
C*ST*RD was assembled and trained within 1 month, in the context of DARPA's Surprise 
Language Exercise, that selected as source a heretofore unstudied language, Hindi. Given 
the brief time, we could not create deep Hindi capabilities for all th ... 

Keywords: Cross-language information retrieval, Hindi-to-English machine translation, 
headline generation, information retrieval and information space navigation, single- and 
multi-document text summarization 



14 Multimedia and visualization (MV): Modelling and filtering of MPEG-7-compliant meta-data 
for digital video 
Harry Agius, Marios C. Angelides 
March 2004 Proceedings of the 2004 ACM symposium on Applied computing 

Full text available: pdf(235.90 KB) Additional Information: full citation, abstract, references, index terms 

The recent MPEG-7 standard specifies a semi-structured meta-data format for open 
interoperability of multimedia. However, the standard refrains from specifying how the 
meta-data is to be used or how meta-data inappropriate to user requirements may be 
filtered out. Consequently, we propose COSMOS-7, which produces structured MPEG-7- 
compliant meta-data for digital video and enables content-based hybrid filtering of that 
meta-data. 

Keywords: MPEG-7, filtering, meta-data, modelling, multimedia 



15 Invited workshop on middleware interoperability of enterprise applications: A marriage 
of Web services and reflective middleware to solve the problem of mobile client 
interoperability 

Paul Grace, Gordon Blair, Sam Samuel 

September 2003 Proceedings of the 1st international symposium on Information and 
communication technologies 

Full text available: pdf(180.79 KB) Additional Information: full citation, abstract, references 

Mobile client applications must discover and interoperate with application services available 
to them at their present location. However, these services will be developed upon a range 
of middleware types (e.g. RMI and publish-subscribe) and advertised using different service 
discovery protocols (e.g. UPnP and SLP) unknown to the application developer. Therefore, a 
middleware platform supporting mobile client applications should ideally adapt its behaviour 
to interoperate with any type of discove ... 



Inside risks: Believing in myths 
Marcus J. Ranum 

January 2004 Communications of the ACM, Volume 47 issue 1 

Full text available: pdf(46.43 KB), html(7.60 KB) Additional Information: full citation, index terms 



Industrial/government track: The anatomy of a multimodal information filter 
Yi-Leh Wu, King-Shy Goh, Beitao Li, Huaxing You, Edward Y. Chang 
August 2003 Proceedings of the ninth ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Full text available: pdf(477.83 KB) Additional Information: full citation, abstract, references, index terms 

The proliferation of objectionable information on the Internet has reached a level of serious 
concern. To empower end-users with the choice of blocking undesirable and offensive 
websites, we propose a multimodal information filter, named MORF. In this paper, we 
present MORF's core components: its confidence-based classifier, a Cross-bagging 
ensemble scheme, and multimodal classification algorithm. Empirical studies and initial 
statistics collected from the MORF filters deplo ... 

Keywords: content filtering, objectionable Web-site filtering 



18 New products 
Linux Journal Staff 

October 2003 Linux Journal, Volume 2003 Issue 114 

Full text available: html(8.16 KB) Additional Information: full citation 

19 Challenges in information retrieval and language modeling: report of a workshop held 
at the center for intelligent information retrieval, University of Massachusetts Amherst, 
September 2002 

James Allan, Jay Aslam, Nicholas Belkin, Chris Buckley, Jamie Callan, Bruce Croft, Sue Dumais, 
Norbert Fuhr, Donna Harman, David J. Harper, Djoerd Hiemstra, Thomas Hofmann, Eduard 
Hovy, Wessel Kraaij, John Lafferty, Victor Lavrenko, David Lewis, Liz Liddy, R. Manmatha, 
Andrew McCallum, Jay Ponte, John Prager, Dragomir Radev, Philip Resnik, Stephen Robertson, 
Roni Rosenfeld, Salim Roukos, Mark Sanderson, Rich Schwartz, Amit Singhal, Alan Smeaton, 
Howard Turtle, Ellen Voorhees, Ralph Weischedel, Jinxi Xu, ChengXiang Zhai 
April 2003 ACM SIGIR Forum, Volume 37 issue 1 

Full text available: pdf(1.50 MB) Additional Information: full citation, citings, index terms, review 



Position papers: A delay-tolerant network architecture for challenged internets 
Kevin Fall 

August 2003 Proceedings of the 2003 conference on Applications, technologies, 
architectures, and protocols for computer communications 

Full text available: pdf(100.02 KB) Additional Information: full citation, abstract, references, citings, index terms 

The highly successful architecture and protocols of today's Internet may operate poorly in 
environments characterized by very long delay paths and frequent network partitions. 
These problems are exacerbated by end nodes with limited power or memory resources. 
Often deployed in mobile and extreme environments lacking continuous connectivity, many 
such networks have their own specialized protocols, and do not utilize IP. To achieve 
interoperability between them, we propose a network architecture a ...

Audio enriched links: web page previews for blind users
Peter Parente 

September 2003 ACM SIGACCESS Accessibility and Computing, Proceedings of the ACM
SIGACCESS conference on Computers and accessibility, issue 77-78
Full text available: pdf(229.96 KB) Additional Information: full citation, abstract, references, index terms

Audio Enriched Links provide previews of linked web pages to users with visual 
impairments. Before a user follows a hyperlink, the Audio Enriched Links software presents 
a spoken summary of the next page including its title, its relation to the current page, 
statistics about its content, and some highlights from its content. We believe that such a 
summary may be a useful surrogate for a full web page, and help users with visual 
impairments decide whether or not to spend time visiting a linked ... 



Keywords: accessibility, speech preview, visual impairment, web page preview 



Summary-based routing for content-based event distribution networks 

Yi-Min Wang, Lili Qiu, Chad Verbowski, Dimitris Achlioptas, Gautam Das, Paul Larson

October 2004 ACM SIGCOMM Computer Communication Review, volume 34 issue 5 

Full text available: pdf(2.82 MB) Additional Information: full citation, abstract, references

Providing scalable distributed Web-based eventing services has been an important research 
topic. It is desirable to have an effective mechanism for the servers to summarize their 
filters for in-network preprocessing in order to optimize system performance. In this paper, 
we propose a summary-based routing mechanism and introduce the notion of imprecise 
summaries to provide a trade-off between routing overhead and event traffic. Our system 
uses similarity-based filter clustering to reduce overall ... 

Finding and preventing run-time error handling mistakes
Westley Weimer, George C. Necula 

October 2004 ACM SIGPLAN Notices, Proceedings of the 19th annual ACM SIGPLAN
Conference on Object-oriented programming, systems, languages, and
applications, Volume 39 issue 10

Full text available: pdf(275.01 KB) Additional Information: full citation, abstract, references, index terms

It is difficult to write programs that behave correctly in the presence of run-time errors. 
Existing programming language features often provide poor support for executing clean-up 
code and for restoring invariants in such exceptional situations. We present a dataflow
analysis for finding a certain class of error-handling mistakes: those that arise from a
failure to release resources or to clean up properly along all paths. Many real-world 
programs violate such resource safety policies becaus ... 

Keywords: dataflow, destructors, exceptions, finalizers, try-finally 



4 Paper session 1: web querying and mining: Spam, damn spam, and statistics: using
statistical analysis to locate spam web pages
Dennis Fetterly, Mark Manasse, Marc Najork 

June 2004 Proceedings of the 7th International Workshop on the Web and Databases: 
colocated with ACM SIGMOD/PODS 2004 

Full text available: pdf(791.70 KB) Additional Information: full citation, abstract, references, index terms

The increasing importance of search engines to commercial web sites has given rise to a
phenomenon we call "web spam", that is, web pages that exist only to mislead search 
engines into (mis)leading users to certain web sites. Web spam is a nuisance to users as 
well as search engines: users have a harder time finding the information they need, and 
search engines have to cope with an inflated corpus, which in turn causes their cost per 
query to increase. Therefore, search engines have a strong inc ... 

Keywords: statistical properties of web pages, web characterization, web spam 



Calculating error rates for filtering software

Paul J. Resnick, Derek L. Hansen, Caroline R. Richardson 

September 2004 Communications of the ACM, volume 47 issue 9

Full text available: pdf(134.44 KB), html(25.90 KB) Additional Information: full citation, abstract, references, index terms

Surveys in the U.S. have found that 95% of schools [4], 43% of public libraries [5], and 
33% of teenagers' parents [8] employ filtering software to block access to pornography and 
other inappropriate content. Many products are also now available to filter out spam 
email. Filtering software, however, cannot perfectly discriminate between allowed and 
forbidden content, resulting in two types of errors. First, under-blocking occurs when 
content is not blocked that should be restricted. Second, over- ... 

Impact of configuration errors on DNS robustness 

Vasileios Pappas, Zhiguo Xu, Songwu Lu, Daniel Massey, Andreas Terzis, Lixia Zhang 
August 2004 ACM SIGCOMM Computer Communication Review, Proceedings of the
2004 conference on Applications, technologies, architectures, and
protocols for computer communications, volume 34 issue 4
Full text available: pdf(327.13 KB) Additional Information: full citation, abstract, references, citings, index terms

During the past twenty years the Domain Name System (DNS) has sustained phenomenal 
growth while maintaining satisfactory performance. However, the original design focused 
mainly on system robustness against physical failures, and neglected the impact of 
operational errors such as misconfigurations. Our recent measurement effort revealed three 
specific types of misconfigurations in DNS today: lame delegation, diminished server 
redundancy, and cyclic zone dependency. Zones with configuration error ... 

Keywords: DNS, misconfigurations, resiliency

Lynette Barnard, Janet Wesson

October 2004 Proceedings of the 2004 annual research conference of the South African
institute of computer scientists and information technologists on IT
research in developing countries

Full text available: pdf(82.14 KB) Additional Information: full citation, abstract, references, index terms

In this paper, the relative importance of trust and usability of e-commerce in South Africa is
discussed. In order to investigate these issues, a heuristic evaluation and an empirical
evaluation were conducted on a number of South African e-commerce sites. A 
comprehensive set of e-commerce design guidelines was compiled, which was used to 
conduct a heuristic evaluation of the selected e-commerce sites. The results of the heuristic 
evaluation indicated that a number of usability problems exist ... 

Keywords: South Africa, e-commerce, e-commerce design guidelines, empirical 
evaluation, heuristic evaluation, trust, trust model, usability 



A digital library component assembly environment
Linda Eyambe, Hussein Suleman 

October 2004 Proceedings of the 2004 annual research conference of the South African 
institute of computer scientists and information technologists on IT 
research in developing countries 

Full text available: pdf(115.23 KB) Additional Information: full citation, abstract, references, index terms

With the advent of the Internet came the promise of global information access. In keeping
with this promise, Digital Libraries (DLs) began to emerge across the world as a method of
providing structured information to their users. These DLs are often created using 
proprietary monolithic software that is often difficult to customise and extend. The Open 
Digital Library (ODL) project was created to demonstrate that DLs can be built as a network 
of components instead of as monolithic systems. Alt ... 

Keywords: components, design, digital libraries, experimentation, graphical user interface, 
open digital libraries, standardization 



9 Cases from the field: Field studies of computer system administrators: analysis of 
system management tools and practices 

Rob Barrett, Eser Kandogan, Paul P. Maglio, Eben M. Haber, Leila A. Takayama, Madhu 
Prabaker 

November 2004 Proceedings of the 2004 ACM conference on Computer supported 
cooperative work 

Full text available: pdf(405.09 KB) Additional Information: full citation, abstract, references, index terms

Computer system administrators are the unsung heroes of the information age, working 
behind the scenes to configure, maintain, and troubleshoot the computer infrastructure that 
underlies much of modern life. However, little can be found in the literature about the 
practices and problems of these highly specialized computer users. We conducted a series 
of field studies in large corporate data centers, observing organizations, work practices, 
tools, and problem-solving strategies of system admi ... 

Keywords: collaboration, command-line interfaces, ethnography, situation awareness, 
system administration 



10 Knowledge sharing in software engineering: Group awareness in distributed software 
development 

Carl Gutwin, Reagan Penner, Kevin Schneider 
November 2004 Proceedings of the 2004 ACM conference on Computer supported
cooperative work

Full text available: pdf(365.27 KB) Additional Information: full citation, abstract, references, index terms

Open-source software development projects are almost always collaborative and
distributed. Despite the difficulties imposed by distance, these projects have managed to
produce large, complex, and successful systems. However, there is still little known about
how open-source teams manage their collaboration. In this paper we look at one aspect of 
this issue: how distributed developers maintain group awareness. We interviewed 
developers, read project communication, and looked at project artifac ... 

Keywords: OSS, collaborative software development, group awareness 



11 Innovative systemic perspectives: Effective work practices for software engineering:
free/libre open source software development
Kevin Crowston, Hala Annabi, James Howison, Chengetai Masango 

November 2004 Proceedings of the 2004 ACM workshop on Interdisciplinary software 
engineering research 

Full text available: pdf(390.40 KB) Additional Information: full citation, abstract, references, index terms

We review the literature on Free/Libre Open Source Software (FLOSS) development and on 
software development, distributed work and teams more generally to develop a theoretical 
model to explain the performance of FLOSS teams. The proposed model is based on 
Hackman's [34] model of effectiveness of work teams, with coordination theory [52] and 
collective mind [79] to extend Hackman's model by elaborating team practices relevant to 
effectiveness in software development. We propose a set of propos ... 

Keywords: collective mind theory, coordination theory, free and open source software, 
team effectiveness 



Searching for the needle in the haystack: taxonomies, tags and targets
Michael Pelikan, James Leous, Richard Pearce, Margaret E. Smith, Russell Vaught 
October 2004 Proceedings of the 32nd annual ACM SIGUCCS conference on User 
services 

Full text available: pdf(165.65 KB) Additional Information: full citation, abstract, references, index terms

The Penn State Taxonomic Tags group, with representatives from Information Technology, 
Business Administration, and the Penn State Libraries, was formed to examine whether a 
taxonomic set of tags, systematically applied across the university's Web pages, could (a) 
make finding specific pages easier from among the University's greater than 500,000 Web 
pages, (b) simplify Web content management tasks and (c) prove useful over time as 
search engines continue to evolve and despite whether open so ... 

Keywords: content management systems, controlled vocabularies, metadata, taxonomies, 
web search engines 



13 From yellow stickies to the world-wide web: the evolution of problem tracking at the
University of Houston 
Julia Kosatka, Anita Bhakta 

October 2004 Proceedings of the 32nd annual ACM SIGUCCS conference on User
services

Full text available: pdf(223.77 KB) Additional Information: full citation, abstract, references, index terms

In 1990, IT Technology Support Services (TSS) was formed by combining several IT
support departments. Cases were distributed to the four or five support people by the
simple expedient of putting sticky notes on their office doors. A support person would 
return from an office call to find his/her office door covered in sticky notes. Missing cases, 
lost phone numbers and angry customers were common events. With an enrollment of 
30,000 students and rising, something had to give. 

A variety ... 

Keywords: RightNowTechnologies, burnout, collaboration, console, e-mail, fileMaker pro, 
helpdesk, notification system, remedy, self-service, tracking, web

Knowledge discovery and data mining

Full text available: pdf(747.27 KB) Additional Information: full citation, abstract, references, index terms

The primary goal of Web usage mining is the discovery of patterns in the navigational 
behavior of Web users. Standard approaches, such as clustering of user sessions and 
discovering association rules or frequent navigational paths, do not generally provide the 
ability to automatically characterize or quantify the unobservable factors that lead to 
common navigational patterns. It is, therefore, necessary to develop techniques that can
automatically discover hidden semantic relationships among use ... 

Keywords: PLSA, Web usage mining, user profiling

Full text available: pdf(192.19 KB) Additional Information: full citation, abstract, references, index terms

We present a Context Ultra-Sensitive Approach based on two-step Recommender systems
(CUSA-2-step-Rec). Our approach relies on a committee of profile-specific neural networks.
This approach provides recommendations that are accurate and fast to train because only 
the URLs relevant to a specific profile are used to define the architecture of each network. 
We compare the proposed approach with collaborative filtering showing that our approach 
achieves higher coverage and precision while bein ... 

Keywords: collaborative filtering, neural networks, web mining

Full text available: pdf(472.86 KB) Additional Information: full citation, abstract, references, index terms

A high quality of free movement, or mobility, is key to the accessibility, design, and 
usability of many 'common-use' hypermedia resources (Web sites) and key to good mobility 
is context and preview. This is especially the case when a hypertext anchor is inaccurately 
described or is described out of context as confusion and disorientation can ensue. Mobility 
is similarly reduced when the link target of the anchor has no relationship to the expected
information present on the hypertext node (Web ...

Keywords: document engineering, evaluation, hypertext, web mobility

Full text available: pdf(160.60 KB) Additional Information: full citation, abstract, references, index terms

We describe current efforts and developments building on our proposal for an XML log 
standard format for digital library (DL) logging analysis and companion tools. Focus is given 
to the evolution of formats and tools, based on analysis of deployment in several DL 
systems and testbeds. Recent development of analysis tools also is discussed. 

Technical papers: testing I: Improving web application testing with user session data 
Sebastian Elbaum, Srikanth Karre, Gregg Rothermel 

May 2003 Proceedings of the 25th International Conference on Software Engineering 

Full text available: pdf(1.10 MB), Publisher Site Additional Information: full citation, abstract, references, citings, index terms

Web applications have become critical components of the global information infrastructure, 
and it is important that they be validated to ensure their reliability. Therefore, many 
techniques and tools for validating web applications have been created. Only a few of these 
techniques, however, have addressed problems of testing the functionality of web 
applications, and those that do have not fully considered the unique attributes of web 
applications. In this paper we explore the notion that user s ... 

Video and multimedia digital libraries: Virtual multimedia libraries built from the web 
Neil C. Rowe 

July 2002 Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries 

Full text available: pdf(147.14 KB) Additional Information: full citation, abstract, references, index terms

We have developed a tool MARIE-4 for building virtual libraries of multimedia (images, 
video, and audio) by automatically exploring (crawling) a specified subdomain of the World 
Wide Web to create an index based on caption keywords. Our approach uses carefully- 
researched criteria to identify and rate caption text, and employs both an expert system 
and a neural network. We have used it to create a keyword-based interface to nearly all 
nontrivial captioned publicly-accessible U.S. Navy images (667 ... 

Keywords: World Wide Web, audio, captions, images, information retrieval, libraries, 
multimedia, video

Full text available: pdf(274.89 KB) Additional Information: full citation, abstract, references, citings, index terms

Increasingly, digital libraries are being defined that collect pointers to World-Wide Web
based resources rather than hold the resources themselves. Maintaining these collections is
challenging due to distributed document ownership and high fluidity. Typically a collection's
maintainer has to assess the relevance of changes with little system aid. In this paper, we
describe the Walden's Paths Path Manager, which assists a maintainer in discovering when
relevant changes occur to linked resour ... 

Keywords: Walden's path, path maintenance 